srini-abhiram
Contributor

Replaces manual similarity calculation and query-based retrieval in FindSimilar with Milvus's Search API for more efficient and accurate similarity search. Updates index creation to use the new HNSW index API. Improves cache hit/miss logic and error handling.

What type of PR is this?
refactor(FindSimilar): Migrate to Milvus for similarity search

What this PR does / why we need it:
This PR refactors the FindSimilar functionality to use the Milvus vector database for similarity search, replacing the previous manual calculation and query-based retrieval logic.

Key changes include:
Adopting Milvus Search API: All similarity search operations now leverage Milvus's native Search API, which is highly optimized for performance and accuracy.

HNSW Indexing: The index creation process has been updated to use the new HNSW (Hierarchical Navigable Small World) index API, which provides faster and more accurate search results for large-scale vector data.

Code Improvements: The caching logic has been streamlined, and error handling for interactions with the Milvus service has been made more robust.

This migration improves the efficiency, scalability, and accuracy of our similarity search feature, and it reduces the maintenance overhead of the custom Go implementation it replaces.

Which issue(s) this PR fixes:
Fixes #150

Release Notes: No


netlify bot commented Oct 6, 2025

Deploy Preview for vllm-semantic-router ready!

Name Link
🔨 Latest commit 2ea24ed
🔍 Latest deploy log https://app.netlify.com/projects/vllm-semantic-router/deploys/68e4bac6568b47000856bb0a
😎 Deploy Preview https://deploy-preview-352--vllm-semantic-router.netlify.app

@srini-abhiram
Contributor Author

If the code changes are fine, I can add an integration test for the Milvus cache. Please advise if my code is incorrect; I'm open to criticism.

@rootfs
Collaborator

rootfs commented Oct 6, 2025

@srini-abhiram this is cool! can you sign the DCO

In your local branch, run: git rebase HEAD~1 --signoff
Force push your changes to overwrite the branch: git push --force-with-lease origin issue-150

Replaces manual similarity calculation and query-based retrieval in FindSimilar with Milvus's Search API for more efficient and accurate similarity search. Updates index creation to use the new HNSW index API. Improves cache hit/miss logic and error handling.

Signed-off-by: Srinivas A <[email protected]>
@srini-abhiram
Contributor Author

@rootfs I have followed your instructions and signed the commit.

Member

@Xunzhuo Xunzhuo left a comment

looks good, thanks!


github-actions bot commented Oct 7, 2025

👥 vLLM Semantic Team Notification

The following members have been identified for the changed files in this PR and have been automatically assigned:

📁 src

Owners: @rootfs, @Xunzhuo, @wangchen615
Files changed:

  • src/semantic-router/pkg/cache/milvus_cache.go

🎉 Thanks for your contributions!

This comment was automatically generated based on the OWNER files in the repository.

@Xunzhuo Xunzhuo merged commit 256c305 into vllm-project:main Oct 7, 2025
8 checks passed
@srini-abhiram
Contributor Author

@rootfs I haven't added the integration test case for Milvus Search yet; I am working on it. Should I create a separate PR when I'm done?

@Xunzhuo
Member

Xunzhuo commented Oct 7, 2025

sure, plz go ahead in a separate PR, thanks! @srini-abhiram

Development

Successfully merging this pull request may close these issues.

Efficient use of Milvus for caching